On the quantum, classical and total 
amount of correlations in a quantum state 
We give an operational definition of the quantum, classical and total amount of correlations in 
a bipartite quantum state. We argue that these quantities can be defined via the amount of work 
(noise) that is required to erase (destroy) the correlations: for the total correlation, we have to erase 
completely, for the quantum correlation one has to erase until a separable state is obtained, and the 
classical correlation is the maximal correlation left after erasing the quantum correlations. 

In particular, we show that the total amount of correlations is equal to the quantum mutual 
information, thus providing it with a direct operational interpretation for the first time. As a 
byproduct, we obtain a direct, operational and elementary proof of strong subadditivity of quantum 
entropy. 



I. INTRODUCTION 

Landauer [1], in analysing the physical nature of (classical)
information, showed that the amount of information stored, say,
in a computer's memory, is proportional to the work required to
erase the memory (reset all the bits to zero). These ideas were
further developed by other researchers (most prominently Bennett)
into a deep connection between classical information and
thermodynamics (see [2] for a recent survey). Here we follow
Landauer's idea in analysing quantum information: we want to
measure correlation by the (thermodynamical) effort required to
erase (destroy) it.

The main idea of our paper can be understood on a simple example.
Consider a maximally entangled state of two qubits (equivalent to
a singlet),

$$|\Phi^+\rangle = \frac{1}{\sqrt{2}}\bigl(|0\rangle_A \otimes |0\rangle_B + |1\rangle_A \otimes |1\rangle_B\bigr). \qquad (1)$$

Usually this state is seen as containing 1 ebit, i.e., one bit of
entanglement, based on the asymptotic theory of pure state
entanglement [3]. The temptation is to think that it contains
1 bit of correlation, and that this correlation is in pure quantum
form (which can be used either quantumly, e.g. for teleportation,
or to obtain one perfectly correlated classical bit).

We will argue however that this state contains in fact 
2 bits of correlation — 1 bit of entanglement and 1 bit 
of remaining (secret) classical correlations, as follows. 

Suppose that Alice wants to erase the entanglement between her
qubit and Bob's. She can do this by applying 1 bit of randomness:
she applies to her qubit one of the two unitary transformations
$\mathbb{1}$ or $\sigma_z$ with equal probability. By this the
pure state in eq. (1) becomes a mixture

$$\rho = \frac{1}{2}|\Phi^+\rangle\langle\Phi^+| + \frac{1}{2}|\Phi^-\rangle\langle\Phi^-|,$$

where

$$|\Phi^-\rangle = \frac{1}{\sqrt{2}}\bigl(|0\rangle_A \otimes |0\rangle_B - |1\rangle_A \otimes |1\rangle_B\bigr).$$

This mixed state is disentangled because it is identical with a
mixture of two direct product states,

$$\rho = \frac{1}{2}|0\rangle\langle 0|_A \otimes |0\rangle\langle 0|_B + \frac{1}{2}|1\rangle\langle 1|_A \otimes |1\rangle\langle 1|_B.$$

But although the entanglement is now gone, Alice's and Bob's
qubits are still correlated. Indeed, $\rho$ now contains 1 bit of
purely classical correlations; furthermore, these correlations are
secret since they are not correlated with any third party, such as
an eavesdropper.

To also erase these classical correlations Alice has to "work"
more. She can do this by randomly applying a "bit flip" to
$\rho$, that is, applying at random, with equal probability,
either $\mathbb{1}$ or $\sigma_x$. This brings the state to

$$\rho' = \frac{1}{2}\mathbb{1}_A \otimes \frac{1}{2}\mathbb{1}_B,$$

where qubit A is completely independent from qubit B.

To summarise, two bits of erasure (or, depending on the point of
view, "bits of noise", or "error") are required to completely
erase the correlations in the singlet. The first bit erases the
entanglement and the second erases the classical secret
correlations. We then say that the singlet contains 1 bit of pure
entanglement, and 1 bit of secret classical correlations. The
total amount of correlations is 2 bits.
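
The two randomisation steps of this example can be checked numerically. The following is a minimal sketch in plain Python (no external libraries); the helper names `kron`, `matmul` and `randomise` are illustrative choices and not notation from the paper. It applies "identity or sigma_z" and then "identity or sigma_x" on Alice's qubit, each with probability 1/2, to the density matrix of the maximally entangled state.

```python
# Two-step erasure of the maximally entangled two-qubit state.
# Plain-Python 4x4 matrices; all helper names are illustrative only.

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def randomise(rho, u):
    """Apply the identity or u with probability 1/2 each.

    Only real unitaries are used here, so the adjoint is the transpose.
    """
    u_dagger = [list(row) for row in zip(*u)]
    rotated = matmul(matmul(u, rho), u_dagger)
    n = len(rho)
    return [[0.5 * (rho[i][j] + rotated[i][j]) for j in range(n)]
            for i in range(n)]

I2 = [[1, 0], [0, 1]]
SZ = [[1, 0], [0, -1]]
SX = [[0, 1], [1, 0]]

# Density matrix of (|00> + |11>)/sqrt(2) in the basis 00, 01, 10, 11.
phi_plus = [[0.5, 0, 0, 0.5],
            [0, 0, 0, 0],
            [0, 0, 0, 0],
            [0.5, 0, 0, 0.5]]

rho = randomise(phi_plus, kron(SZ, I2))    # first bit of noise: sigma_z on A
rho_prime = randomise(rho, kron(SX, I2))   # second bit of noise: sigma_x on A
```

After the first step `rho` is the classically correlated mixture of |00> and |11>; after the second, `rho_prime` is the maximally mixed state on both qubits.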

A couple of remarks concerning the connection to Landauer's
theory of information erasure: just as Landauer did for
information (entropy!), our approach quantifies correlations via
their robustness against destruction. However,
there seems to be a contradiction: whereas Landauer con- 
siders resetting the memory to a standard state (and we 
take for granted that one can generalise his argument to 
quantum memory), effectively exporting — "dissipating" 
— the entropy of the system, we inject entropy into it. 
This is actually only an apparent contradiction, as can be 
seen easily once we realise that in the above example we 
tacitly assumed that Alice forgets which Pauli operator 
she has applied. Indeed, we can present what she does 
in more detail as follows: she has a reservoir of random 
bits, which she uses to apply one of the Pauli operators 
as above in a reversible way (by a quantum-controlled 
unitary). This step does not affect the correlations be- 
tween A and B. Only when she decides to erase (forget) 
the random bits, the correlations are affected, as we have 
shown above. Now it is evident that the entropy pumped 
into the state is equal to the Landauer erasure cost of the 
random bits. 

In this paper we develop these ideas, as follows. 

For an arbitrary bipartite quantum state $\rho_{AB}$ the
quantum mutual information is defined as

$$I(A:B) = S(\rho_A) + S(\rho_B) - S(\rho_{AB}).$$

(The name is taken from Cerf and Adami [4], but Stratonovich [5]
had considered this quantity already in the mid-60s.)
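
As a quick illustration of this definition, here is a worked evaluation for the three states of the introductory example (an added example, with logarithms to base 2; each line lists $S(\rho_A) + S(\rho_B) - S(\rho_{AB})$):

```latex
\begin{align*}
\rho_{AB} = |\Phi^+\rangle\langle\Phi^+|
  &: \quad I(A:B) = 1 + 1 - 0 = 2, \\
\rho_{AB} = \tfrac{1}{2}|00\rangle\langle 00| + \tfrac{1}{2}|11\rangle\langle 11|
  &: \quad I(A:B) = 1 + 1 - 1 = 1, \\
\rho_{AB} = \tfrac{1}{2}\mathbb{1}_A \otimes \tfrac{1}{2}\mathbb{1}_B
  &: \quad I(A:B) = 1 + 1 - 2 = 0.
\end{align*}
```

These values match the bit counts of the example: two bits in total, of which one remains after the entanglement has been erased.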

While this definition is formally very simple, an operational
interpretation for it was hitherto missing (at least for the
quantity itself with a given state; it plays however a crucial
role in the formula for the entanglement-assisted capacity of a
quantum channel). We show here that the total amount of
correlations, as measured by the minimal rate of randomness that
is required to completely erase all the correlations in
$\rho_{AB}$ (in a many-copy scenario), is equal to the quantum
mutual information. This is the main result of section II. As an
important consequence of this result we shall demonstrate that it
leads to the strong subadditivity of von Neumann entropy.

In our above example this amount of total correlations divides
neatly into the amount required to obliterate the quantum
correlations (1 bit), and the amount to take the resulting
separable state to a product state (1 bit). We will follow up on
this in the discussion in section III, where we use this approach
to define the quantum and the classical correlations in a state,
and conjecture how they compare with the total correlations.

Section IV contains some observations on how the total, quantum
and classical correlations as defined here relate to other such
measures.

Then, in section V, we extend our considerations to correlations
(quantum and classical) of more than two players, after which we
conclude.

An appendix quotes the technical results about typical subspaces
and our main tool, an operator version of the classical Chernoff
bound, which are used repeatedly, as well as miscellaneous proofs.

II. TOTAL BIPARTITE CORRELATIONS 

As explained in the introduction, we want to add randomness to a
state $\rho = \rho_{AB}$ of a bipartite system AB (with local
Hilbert space dimensions $d_A, d_B < \infty$) in such a way as to
make it into a product state. In fact, we shall consider
$n \to \infty$ many copies of $\rho$, and be content with
achieving decorrelation (product state) approximately (but
arbitrarily well in the asymptotic limit).

In detail, the randomisation will be engineered by an ensemble of
local unitaries, $\{p_i, U_i \otimes V_i\}_{i=1}^N$, to which is
associated the randomising map

$$R : \tau \longmapsto \sum_{i=1}^{N} p_i\,(U_i \otimes V_i)\,\tau\,(U_i \otimes V_i)^\dagger. \qquad (2)$$

We call the class of such completely positive and trace
preserving (cptp) maps on AB "coordinated local unitary
randomising" (COLUR). Considering that our object is to study the
correlation between A and B, it may seem a bit suspicious to
allow coordinated application of $U_i$ and $V_i$ at the two
sites. Hence we define A-LUR to be those maps where all
$V_i = \mathbb{1}$, and B-LUR those where all $U_i = \mathbb{1}$,
because they can be implemented by application of noise strictly
locally at A or B alone, respectively. The combination of an
A-LUR with a B-LUR map (i.e., independent local noise at either
side) we call simply "local unitary randomising" (LUR).

We say that $R$ $\epsilon$-decorrelates a state $\rho$ if there
is a product state $\omega_A \otimes \omega_B$ such that

$$\|R(\rho) - \omega_A \otimes \omega_B\|_1 \leq \epsilon, \qquad (3)$$

where $\|\cdot\|_1$ is the trace norm of an operator, i.e. the sum
of the absolute values of the eigenvalues. For technical reasons,
when we study the asymptotics of such transformations (i.e.,
acting on $n$ copies of the state $\rho$), we will demand that the
output of the map $R$ (and similar maps studied below) is
supported on a space of dimension $d^n$, for all $n$, with some
finite $d$.

How to account for the amount of noise introduced? From the point
of view of the ensemble of unitaries, the most conservative option
is to take $\log N$, the space required to identify the element
$i$ uniquely. A smaller, and in the many-copy asymptotics
meaningful, quantity would be $H(p) = -\sum_i p_i \log p_i$. Note
however that neither is uniquely associated with the randomising
map $R$. However, Schumacher, and earlier Lindblad, have proposed
a measure of the entropy that a cptp map $T$ injects into the
system $P$ on which it acts: for this purpose, one has to
introduce an environment $E$, which is initially in a pure state,
and to fix a reference system $Z$, which purifies $\rho_P$ to
$|\psi\rangle_{ZP}$ (note that all such purifications are related
via unitaries on $Z$). Then, the entropy exchange is defined as

$$S_e(T, \rho_P) := S\bigl((\mathrm{id}_Z \otimes T_P)|\psi\rangle\langle\psi|\bigr).$$

It is the entropy the environment (initially in a pure state)
acquires in a unitary dilation of the cptp map. In this paper, $P$
will be a bipartite system AB.

Based on elementary properties of the von Neumann entropy, one can
see that for every randomising map $R$ as above, and every state
$\rho$,

$$\log N \geq H(p) \geq S_e(R, \rho). \qquad (4)$$
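
The chain of inequalities (4) can be illustrated for an ensemble of just two unitaries. The sketch below is an added example, not part of the original argument: it relies on the standard expression of the entropy exchange as the entropy of the matrix $W_{ij} = \sqrt{p_i p_j}\,\mathrm{Tr}(U_i \rho U_j^\dagger)$, and the function names and the two test cases are illustrative assumptions.

```python
import math

def h2(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def entropy_exchange_two_unitaries(p, overlap):
    """Entropy exchange of the map rho -> p*rho + (1-p)*U rho U^dagger.

    The 2x2 matrix W = [[p, c], [conj(c), 1-p]] has
    c = sqrt(p(1-p)) * Tr(rho U^dagger); `overlap` is that trace.
    Its eigenvalues depend only on |c|.
    """
    c = math.sqrt(p * (1 - p)) * abs(overlap)
    gap = math.sqrt((2 * p - 1) ** 2 + 4 * c * c)
    return h2([(1 + gap) / 2, (1 - gap) / 2])

log_n = math.log2(2)        # two unitaries in the ensemble
h_p = h2([0.5, 0.5])        # entropy of the mixing probabilities

# Maximally entangled state, U = sigma_z on A: Tr(rho U^dagger) = 0.
s_e_bell = entropy_exchange_two_unitaries(0.5, 0.0)

# State |0><0| on A, U = sigma_z on A: Tr(rho U^dagger) = 1,
# so the map leaves the state unchanged and injects no entropy.
s_e_zero = entropy_exchange_two_unitaries(0.5, 1.0)
```

In the first case all three quantities equal 1 bit; in the second the entropy exchange vanishes although $\log N = H(p) = 1$, which shows that the last inequality in (4) can be strict.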

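Proposition II.1 Consider any COLUR map on the bipartite system
$A^nB^n$,

$$R : \tau \longmapsto \sum_{i=1}^{N} p_i\,(U_i \otimes V_i)\,\tau\,(U_i \otimes V_i)^\dagger,$$

which $\epsilon$-decorrelates $\rho^{\otimes n}$. Then the
entropy exchange of $R$ relative to $\rho^{\otimes n}$ is lower
bounded,

$$S_e(R, \rho^{\otimes n}) \geq n\bigl(I(A:B) - 3\epsilon\log d - \eta(3\epsilon)\bigr), \qquad (5)$$

where

$$\eta(x) = \begin{cases} -x\log x & \text{for } x \leq \frac{1}{e}, \\ \frac{1}{e}\log e & \text{for } x > \frac{1}{e}. \end{cases}$$

In particular, the right hand side is also a lower bound on
$H(p)$, and even more so on $\log N$.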

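Proof First of all, because $R$ acts locally,

$$R_A := \mathrm{Tr}_B R(\rho^{\otimes n}) = \sum_{i=1}^{N} p_i\, U_i \rho_A^{\otimes n} U_i^\dagger,$$

and similarly for $R_B := \mathrm{Tr}_A R(\rho^{\otimes n})$.
Hence we have (using the concavity of the von Neumann entropy)

$$S(R_A) \geq nS(\rho_A), \qquad S(R_B) \geq nS(\rho_B). \qquad (6)$$

On the other hand, we can argue that $R(\rho^{\otimes n})$ is
very close to $R_A \otimes R_B$. Indeed, from eq. (3) it follows
that

$$\|R_A - \omega_A\|_1 \leq \|R(\rho^{\otimes n}) - \omega_A \otimes \omega_B\|_1 \leq \epsilon.$$

Similarly,

$$\|R_B - \omega_B\|_1 \leq \epsilon.$$

Thus, by the triangle inequality,

$$\|R_A \otimes R_B - \omega_A \otimes \omega_B\|_1 \leq 2\epsilon,$$

and we get

$$\|R(\rho^{\otimes n}) - R_A \otimes R_B\|_1 \leq 3\epsilon. \qquad (7)$$

Hence, by the Fannes inequality [10],

$$S(R_A) + S(R_B) - S\bigl(R(\rho^{\otimes n})\bigr) \leq 3\epsilon\log d^n + \eta(3\epsilon). \qquad (8)$$

Taking into account eq. (6) we obtain

$$S\bigl(R(\rho^{\otimes n})\bigr) \geq n\bigl(S(\rho_A) + S(\rho_B) - 3\epsilon\log d - \eta(3\epsilon)\bigr). \qquad (9)$$

Here we use the fact that multiplying the last term in eq. (8) by
$n$ will only weaken the inequality. Now, introduce a purifying
reference system $Z$ for our state: $\rho = \mathrm{Tr}_Z \psi$,
with a pure state $\psi = |\psi\rangle\langle\psi|$ on ZAB. Then
the randomising map acts on $A^nB^n$, producing the state

$$\Omega = (\mathrm{id}_Z^{\otimes n} \otimes R)(\psi^{\otimes n})$$

on $Z^nA^nB^n$. So, by definition of the entropy exchange,

$$\begin{aligned}
S_e(R, \rho^{\otimes n}) &= S(\Omega_{Z^nA^nB^n}) \\
 &\geq S(\Omega_{A^nB^n}) - S(\Omega_{Z^n}) \\
 &= S\bigl(R(\rho^{\otimes n})\bigr) - S(\rho^{\otimes n}) \\
 &\geq n\bigl(S(\rho_A) + S(\rho_B) - S(\rho) - 3\epsilon\log d - \eta(3\epsilon)\bigr),
\end{aligned}$$

where in the second line we have used the Araki-Lieb (or
triangle) inequality [11], and in the third line the fact that
$R$ acted only on $A^nB^n$, i.e. initially
$S(\rho_{Z^n}) = S(\rho_{A^nB^n})$; in the last line we have
inserted eq. (9). □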
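On the other hand, we have:

Proposition II.2 For any state $\rho$ and $\epsilon > 0$ there
exists, for all sufficiently large $n$, an A-LUR map

$$R : \tau \longmapsto \frac{1}{N}\sum_{i=1}^{N} (U_i \otimes \mathbb{1})\,\tau\,(U_i \otimes \mathbb{1})^\dagger$$

on $A^nB^n$, which $\epsilon$-decorrelates $\rho^{\otimes n}$,
and with

$$\log N \leq n\bigl(I(A:B) + \epsilon\bigr).$$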

Proof For large $n$, we change the state $\rho^{\otimes n}$ very
little by restricting it to its typical subspace, with projector
$\Pi$ (see appendix A), and even restricting the systems $A^n$
($B^n$) to the local typical subspaces of $\rho_A^{\otimes n}$
($\rho_B^{\otimes n}$), with projector $\Pi_A$ ($\Pi_B$):

$$\tilde\rho := (\Pi_A \otimes \Pi_B)\,\Pi\,\rho^{\otimes n}\,\Pi\,(\Pi_A \otimes \Pi_B). \qquad (10)$$

By definition of the typical subspace projectors, $\tilde\rho$
differs from $\rho^{\otimes n}$ only slightly in trace norm, using
the "gentle measurement lemma" A.2.

From the properties of the typical projectors (see again
appendix A) we obtain that $\tilde\rho$ is an operator of trace
$\geq 1 - 3\epsilon$ supported on a tensor product of (typical
sub-) spaces of dimensions $D_A \leq 2^{n(S(\rho_A)+\epsilon)}$
and $D_B \leq 2^{n(S(\rho_B)+\epsilon)}$, and such that

$$\tilde\rho \leq \frac{1}{D}\,\Pi_A \otimes \Pi_B,$$

where $D = 2^{n(S(\rho)-\epsilon)}$. It is for this latter
property that we needed to put the global typical projector $\Pi$
in the definition of $\tilde\rho$, eq. (10).

For the following argument we will also need a lower bound on the
reduced state on B, which we engineer by a further reduction:
define the projection $\Pi'_B$ on the subspace where
$\mathrm{Tr}_A\tilde\rho \geq \epsilon/D_B$, and let

$$\hat\rho := (\mathbb{1}_A \otimes \Pi'_B)\,\tilde\rho\,(\mathbb{1}_A \otimes \Pi'_B).$$

Then it is immediate that
$\mathrm{Tr}\,\hat\rho \geq \mathrm{Tr}\,\tilde\rho - \epsilon \geq 1 - 4\epsilon$,
hence by the gentle measurement lemma A.2 $\hat\rho$ is close to
$\tilde\rho$ in trace norm, and we can keep for later reference
the approximation

$$\|\hat\rho - \rho^{\otimes n}\|_1 \leq 8\sqrt{\epsilon}. \qquad (11)$$

Observe that we have defined all these projections in such a way
that

$$\omega'_B := \mathrm{Tr}_A\hat\rho \geq \frac{\epsilon}{D_B}\,\Pi'_B.$$

Now take any ensemble of unitaries, $\{p(dU), U\}$, such that for
all states $\varphi$ on the typical subspace of
$\rho_A^{\otimes n}$,

$$\int p(dU)\, U\varphi U^\dagger = \frac{1}{D_A}\,\Pi_A =: \omega_A$$

(a private quantum channel in the terminology of [12]): for
example, the discrete Weyl operators on the typical subspace of
$\rho_A^{\otimes n}$; but all unitaries on that subspace with the
corresponding Haar measure are good as well. (The unitaries can
behave in any way outside the subspace.) By elementary linear
algebra, with $\hat\rho$ the restricted state defined above and
$\omega'_B$ its reduced state on B,

$$\int p(dU)\,(U \otimes \mathbb{1})\,\hat\rho\,(U \otimes \mathbb{1})^\dagger = \omega_A \otimes \omega'_B.$$

Now we show, using the "operator Chernoff bound", lemma A.3 in
appendix A, that we can select a small subensemble of these
unitaries doing the same job to sufficient approximation (this is
an argument like those used in [13]). To this end, we understand
Alice's local unitary $U$ as a random variable with distribution
$p(dU)$, and define the operator valued random variable

$$X := D\,(U \otimes \mathbb{1})\,\hat\rho\,(U \otimes \mathbb{1})^\dagger.$$

By the above, $0 \leq X \leq \mathbb{1}$ and

$$\mathbb{E}X = D\,\omega_A \otimes \omega'_B \geq \epsilon\,2^{-n(I(A:B)+3\epsilon)}\,\Pi_A \otimes \Pi'_B.$$

Thus, if $X_1, \ldots, X_N$ are independent realisations of $X$,
lemma A.3 yields

$$\Pr\left\{\frac{1}{N}\sum_{i=1}^{N} X_i \notin \bigl[(1-\epsilon)\mathbb{E}X,\,(1+\epsilon)\mathbb{E}X\bigr]\right\} \leq 2D_AD_B\exp\left(-N\epsilon\,2^{-n(I(A:B)+3\epsilon)}\,\frac{\epsilon^2}{2}\right),$$

where the factor 2 on the right hand side follows from adding the
two probability bounds of lemma A.3. For
$N = 2^{n(I(A:B)+4\epsilon)}$ or larger (and sufficiently large
$n$) this is smaller than 1, and we can conclude that there exist
$U_1, \ldots, U_N$ from the a priori ensemble such that

$$(1-\epsilon)\,\omega_A \otimes \omega'_B \leq \frac{1}{N}\sum_{i=1}^{N} (U_i \otimes \mathbb{1})\,\hat\rho\,(U_i \otimes \mathbb{1})^\dagger \leq (1+\epsilon)\,\omega_A \otimes \omega'_B.$$

Note that it is enough to show that this probability is just
smaller than one, i.e. that at least one such set of unitaries
exists.

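Putting this together with eq. (11), we get

$$\left\|\frac{1}{N}\sum_{i=1}^{N} (U_i \otimes \mathbb{1})\,\rho^{\otimes n}\,(U_i \otimes \mathbb{1})^\dagger - \omega_A \otimes \omega'_B\right\|_1 \leq \epsilon + 8\sqrt{\epsilon},$$

hence for the state $\omega_B := \omega'_B/\mathrm{Tr}\,\omega'_B$,

$$\left\|\frac{1}{N}\sum_{i=1}^{N} (U_i \otimes \mathbb{1})\,\rho^{\otimes n}\,(U_i \otimes \mathbb{1})^\dagger - \omega_A \otimes \omega_B\right\|_1 \leq 5\epsilon + 8\sqrt{\epsilon}.$$

The last inequality shows that the map $R$ we have constructed
does indeed $(5\epsilon + 8\sqrt{\epsilon})$-decorrelate
$\rho^{\otimes n}$. □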

Putting eq. (4) and propositions II.1 and II.2 together, we
obtain the (robust) asymptotic measure of total correlation in a
quantum state:

Theorem II.3 The total correlations in a bipartite state
$\rho_{AB}$, as measured by the asymptotically minimal amount of
local noise one has to add to turn it into a product (let us
denote this $C_{er}(\rho)$, the correlation of erasure of
$\rho$), is $I(A:B) = S(\rho_A) + S(\rho_B) - S(\rho_{AB})$.
Mathematically,

$$\begin{aligned}
 &\sup_{\epsilon>0}\,\liminf_{n\to\infty}\,\frac{1}{n}\min\bigl\{S_e(R, \rho^{\otimes n}) : R\ \epsilon\text{-decorr. COLUR}\bigr\} \\
 &\quad = \sup_{\epsilon>0}\,\limsup_{n\to\infty}\,\frac{1}{n}\min\bigl\{\log N : R\ \epsilon\text{-decorr. A-LUR}\bigr\} \\
 &\quad = I(A:B).
\end{aligned}$$

So, whether we allow general LUR ensembles or ones restricted to
A (or B), whether we count conservatively the size of the
ensemble, $\log N$, or are lax and charge only the entropy
exchange, and whether we define the best rate optimistically or
pessimistically, it all comes down to the quantum mutual
information as the optimal noise (erasure) rate to remove the
total correlation. □

In passing we note that this implies the perhaps surprising
result that the three ways of measuring the noise in eq. (4) are
asymptotically equivalent, as expressed in propositions II.1 and
II.2. In |b| the authors argue that the entropy exchange is a way
of measuring the noise of a cptp map based on compressibility. It
seems to us that the connection to that work is the following:
while one can always change the basis of the environment to
interpret the entropy exchange as the "entropy of Kraus operators
acting", this change of basis will turn our initially unitary
Kraus operators into something else. We instead want to modify
the cptp map so as to preserve the entropy exchange and unitarity
of the Kraus operators.

We now want to present a line of thought intended to reconcile
our earlier doubts whether allowing coordinated LUR would be a
well-behaved concept. This is based on the realisation that
providing the players with the perfectly correlated data $i$
(with probability $p_i$) is effectively giving them another state
$\gamma = \sum_i p_i |i\rangle\langle i|_A \otimes |i\rangle\langle i|_B$.
This gives us the idea of regarding the situation as a kind of
catalysis; the task, for given (general) $\gamma$, is to
decorrelate $\rho \otimes \gamma$, but we will have to discount
the overhead $C_{er}(\gamma)$ of just erasing the correlations in
$\gamma$.

So, we really want to consider the infimum (over all $\gamma$) of
the erasure cost of $\rho \otimes \gamma$ minus the cost of
$\gamma$. Of course, in the light of our theorem II.3 this is
$I(A:B)$ (which means that allowing catalysis does not change the
content of our theorem). Conceptually, however, we gain an
insight: supposing we allow only LUR in the randomisation, then
giving the parties a perfect correlation $\gamma$ allows them the
following strategy: they use the perfect correlation to implement
a general COLUR map to erase the correlations in $\rho$ and after
this the one in $\gamma$. We don't need to know how much the
latter costs because we subtract the same cost anyway.

Thus, even though we may be restricted to LUR at first, the
availability of appropriate $\gamma$ in a catalytic scenario
effectively motivates consideration of general COLUR maps. It is
a nice observation, though, that in theorem II.3 we can locally
restrict to A-LUR without the need to resort to catalysts.

Remark II.4 It may be worth noting that our lower bound in
proposition II.1 is valid for an even larger class of operations,
namely "local unital" (LUN) cptp maps: these are compositions of
unital (i.e., identity preserving) maps locally at A and at B.
This is because all we need for the argument is that the local
entropies of Alice and Bob can only increase under the map, which
is exactly the property of unital cptp maps; the rest of the
proof is the same (observe in particular that entropy exchange
makes sense for whatever cptp map we have, not just mixtures of
unitaries!). Clearly LUR is a subset of LUN, and we can even
emulate COLUR maps by including catalysis in the sense of the
previous remarks.

We can interpret this result intuitively using our explanation of
our approach in terms of (reversible) local unitaries and
Landauer erasure, as given in the introduction. Namely, it is
well-known that unital maps $T$ are exactly those which admit a
dilation

$$T(\varphi) = \mathrm{Tr}_E\Bigl(U\bigl(\varphi \otimes \tfrac{1}{d_E}\mathbb{1}_E\bigr)U^\dagger\Bigr).$$

Hence, the local unital maps of Alice and Bob can be understood
as reversibly interacting their registers with local noise, and
subsequent erasure of that noise. The cost of the latter is
bounded by the entropy exchange.

Corollary II.5 (Strong subadditivity) For any tripartite state
$\rho_{ABC}$,

$$I(A:C|B) = S(\rho_{AB}) + S(\rho_{BC}) - S(\rho_{ABC}) - S(\rho_B) \geq 0.$$

Proof The strong subadditivity inequality as expressed above is
equivalent to

$$I(A:BC) \geq I(A:B).$$

However, by theorem II.3 above, the left hand side is the minimum
local noise necessary and sufficient to asymptotically
decorrelate A from BC, and we may consider an A-LUR for this,
i.e., randomisation acting only on A. Since a map which
$\epsilon$-decorrelates $\rho_{A|BC}$ surely also
$\epsilon$-decorrelates $\rho_{AB}$, this minimum noise is larger
than or equal to the minimum noise to decorrelate the latter
state, which is the right hand side, once more by theorem II.3.
Observe that the proof of theorem II.3 did not invoke strong
subadditivity: in the lower bound, proposition II.1, we have only
used concavity (Schur convexity) and subadditivity of the
entropy; in the upper bound, proposition II.2, only typical
subspaces and random coding were employed. □
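
The equivalent form $I(A:BC) \geq I(A:B)$ is easy to check numerically in the classical special case, i.e. for states that are diagonal in a product basis, where the von Neumann entropy reduces to the Shannon entropy. The sketch below is an added sanity check under that assumption, not a proof and not a statement about general quantum states; the distribution and the function names are hypothetical.

```python
import math

def shannon(dist):
    """Shannon entropy in bits of a dict {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(dist, keep):
    """Marginal of a joint distribution over tuples, keeping given positions."""
    out = {}
    for outcome, p in dist.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def mutual_information(dist, left, right):
    """I(left : right) = H(left) + H(right) - H(left, right)."""
    return (shannon(marginal(dist, left)) + shannon(marginal(dist, right))
            - shannon(marginal(dist, left + right)))

# Hypothetical example: A is a fair bit, B equals A except with
# probability 0.1, and C = A xor B. Outcomes are tuples (a, b, c).
dist = {}
for a in (0, 1):
    for b in (0, 1):
        dist[(a, b, a ^ b)] = 0.5 * (0.9 if a == b else 0.1)

i_a_bc = mutual_information(dist, [0], [1, 2])   # I(A:BC)
i_a_b = mutual_information(dist, [0], [1])       # I(A:B)
```

Here B and C together determine A, so `i_a_bc` is one bit, while `i_a_b` is strictly smaller because B alone is a noisy copy of A.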

Remark II.6 While it is worth noting that in our noise model we
have not allowed communication between the parties, and that
indeed (and unsurprisingly) communication can decrease as well as
increase the total correlation, our result shows that the total
correlation $C_{er}(\rho)$ is indeed monotonic under local
operations and public communication (LOPC) in the following
sense.

Every LOPC is a succession of steps of the form that Alice (Bob)
performs a quantum instrument [15] locally, transforming the
state $\rho$ into an ensemble $\{p_i, \rho_i\}$, of which she
(he) communicates $i$ to the other party. In general, such a
local quantum instrument can be characterized by adding an
ancillary system $A'$ on, say, Alice's side and letting $A'$
interact with the original system A. Thus, the transformation

$$\rho_{AB} \otimes \rho_{A'} \longmapsto \sum_i p_i\,|i\rangle\langle i|_{A'} \otimes (\rho_i)_{AB} =: \sigma_{AA'B}$$

is implemented locally by a cptp map. By adding a local ancilla
Alice cannot change the quantum mutual information between her
and Bob, i.e. initially

$$I(AA':B)_{\rho_{AB}\otimes\rho_{A'}} = I(A:B)_{\rho_{AB}}. \qquad (12)$$

On the other hand,

$$\begin{aligned}
I(AA':B)_{\rho_{AB}\otimes\rho_{A'}} &\geq I(AA':B)_{\sigma_{AA'B}} \\
 &= I(A':B)_\sigma + I(A:B|A')_\sigma \\
 &\geq I(A:B|A')_\sigma \qquad (13) \\
 &= \sum_i p_i\, I(A:B)_{\rho_i},
\end{aligned}$$

where in the first line we used monotonicity of $I$ under local
operations, in the second line we used the formal "quantum
conditional mutual information", and in the third and fourth we
used standard properties of the von Neumann entropy. Combining
eqs. (12) and (13) we obtain

$$I(A:B)_{\rho_{AB}} \geq \sum_i p_i\, I(A:B)_{\rho_i}.$$

The expression on the right is the average of the total
correlations after the instrument. We can interpret this as the
correlation between Alice and Bob conditional on an eavesdropper
who monitors the classical communication between them; in this
way the common knowledge of the classical message $i$ does not
count as correlation between Alice and Bob.



III. BIPARTITE ENTANGLEMENT AND 
CLASSICAL CORRELATIONS 

A. Quantum correlations 

Now we use the same method of randomisation to define an
entanglement measure. It will be the minimum noise one has to add
locally to a state $\rho$ to make it a separable state $\sigma$.
Of course, as in section II, we will adopt an asymptotic and
approximate point of view:

To the disentanglement process we associate the randomising map
$R$ as in eq. (2). We say that $R$ $\epsilon$-disentangles a
state $\rho$ if there is a separable state
$\sigma = \sum_k q_k\,\sigma_A^{(k)} \otimes \sigma_B^{(k)}$ such
that

$$\|R(\rho) - \sigma\|_1 \leq \epsilon. \qquad (14)$$

As in section II, we can (and will) restrict ourselves to LUR,
keeping in mind that the appropriate $\gamma$ in a catalytic
scenario will easily motivate a generalization to COLUR maps.

In the previous section there was an undercurrent message that
the minimum noise we have to add is the (minimal) entropy
difference between the state and the target class. There it was
product states achievable by LUR; here we will aim at separable
states achievable by LUR (up to $\epsilon$-approximations). In
detail, we can prove:

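Proposition III.1 Let $T$ be an $\epsilon$-disentangling map for
$\rho^{\otimes n}$. Then,

$$\log N \geq H(p) \geq S_e(T, \rho^{\otimes n}) \geq \inf_{R,\sigma}\bigl(S(\sigma) - nS(\rho) - n\epsilon\log d - \eta(\epsilon)\bigr),$$

where the infimum is over all COLUR maps $R$ and separable states
$\sigma$ with $\|\sigma - R(\rho^{\otimes n})\|_1 \leq \epsilon$.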

Proof Just as in the proof of proposition II.1 we introduce a
purification $\psi$ of $\rho$ on the extended system ZAB; the
randomising map acts on $A^nB^n$, resulting in the state

$$\Omega = (\mathrm{id}_Z^{\otimes n} \otimes T)(\psi^{\otimes n}).$$

As before, by the definition of the entropy exchange,

$$\begin{aligned}
S_e(T, \rho^{\otimes n}) &= S(\Omega_{Z^nA^nB^n}) \\
 &\geq S(\Omega_{A^nB^n}) - S(\Omega_{Z^n}) \\
 &= S\bigl(T(\rho^{\otimes n})\bigr) - S(\rho^{\otimes n}) \\
 &\geq S(\sigma) - nS(\rho) - n\epsilon\log d - \eta(\epsilon),
\end{aligned}$$

where in the second line we have used the triangle inequality and
in the third line the fact that $T$ acted only on $A^nB^n$; in
the last line we have substituted the separable state $\sigma$
with $\|\sigma - T(\rho^{\otimes n})\|_1 \leq \epsilon$, which
exists by assumption, and have used the Fannes inequality. □

Proposition III.2 Let $k > 0$ and $T$ be a COLUR map such that
$\sigma := T(\rho^{\otimes k})$ is separable. Then for all
$\epsilon$ and sufficiently large $n$ there exists an
$\epsilon$-disentangling COLUR map $R$ as in eq. (2), with

$$\log N \leq n\bigl(S(\sigma) - kS(\rho) + \epsilon\bigr).$$

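Proof We assume the form of eq. (2) for the map $T$. To begin
with, we have for all $n$,

$$T^{\otimes n}(\rho^{\otimes kn}) = \sigma^{\otimes n}, \qquad (15)$$

which is separable. Our goal will be to construct a COLUR map
with the desired properties, which approximates $T^{\otimes n}$.

To this end, we use a typical projector $\Pi_1$ of
$\rho^{\otimes kn}$ and a typical projector $\Pi_2$ of
$\sigma^{\otimes n}$: for sufficiently large $n$, the right hand
side is changed by not more than $\epsilon$ if we sandwich the
state between $\Pi_2$, and the left hand side is changed by not
more than $\epsilon$ if we replace $\rho^{\otimes kn}$ by
$\tilde\rho := \Pi_1\rho^{\otimes kn}\Pi_1$. (This has the effect
of making $\tilde\rho \leq \frac{1}{D_1}\Pi_1$, with
$D_1 \geq 2^{kn(S(\rho)-\epsilon)}$.) Hence,

$$\tilde\sigma := \Pi_2\bigl(T^{\otimes n}(\tilde\rho)\bigr)\Pi_2$$

satisfies $\|\tilde\sigma - \sigma^{\otimes n}\|_1 \leq 2\epsilon$.

Since $\tilde\sigma$ is supported on a subspace of dimension
$D_2 = \mathrm{Tr}\,\Pi_2 \leq 2^{n(S(\sigma)+\epsilon)}$, we
alter it again only by not more than $\epsilon$ if we restrict it
to the subspace where it is $\geq \epsilon/D_2$; denote the
corresponding projector $\Pi_3$ and let
$\hat\sigma := \Pi_3\tilde\sigma\Pi_3$.

Now we are in a position to use the operator Chernoff bound once
more: we understand the ensemble of unitaries defining
$T^{\otimes n}$,

$$W = U_i \otimes V_i = (U_{i_1} \otimes \cdots \otimes U_{i_n}) \otimes (V_{i_1} \otimes \cdots \otimes V_{i_n}),$$

as a random variable with probability density
$p(W) = p_i = p_{i_1} \cdots p_{i_n}$. Now we can define random
operators

$$X := D_1\,\Pi_3\Pi_2\,W\tilde\rho W^\dagger\,\Pi_2\Pi_3,$$

which by the above obey $0 \leq X \leq \mathbb{1}$, and

$$\mathbb{E}X = D_1\hat\sigma \geq \epsilon\,\frac{D_1}{D_2}\,\Pi_3 \geq \epsilon\,2^{-n(S(\sigma)-kS(\rho)+2\epsilon)}\,\Pi_3.$$

Hence, for independent realisations $X_1, \ldots, X_N$ of $X$,
lemma A.3 gives

$$\Pr\left\{\frac{1}{N}\sum_{i=1}^{N} X_i \notin \bigl[(1-\epsilon)\mathbb{E}X,\,(1+\epsilon)\mathbb{E}X\bigr]\right\} \leq 2d^n\exp\left(-N\epsilon\,2^{-n(S(\sigma)-kS(\rho)+2\epsilon)}\,\frac{\epsilon^2}{2}\right).$$

Hence, for $N = 2^{n(S(\sigma)-kS(\rho)+3\epsilon)}$ (and
sufficiently large $n$), this probability is less than 1; this
means that there are unitaries $W_1, \ldots, W_N$ from the
original ensemble of product unitaries, such that

$$(1-\epsilon)\,\hat\sigma \leq \frac{1}{N}\sum_{i=1}^{N} \Pi_3\Pi_2\,W_i\tilde\rho W_i^\dagger\,\Pi_2\Pi_3 \leq (1+\epsilon)\,\hat\sigma.$$

This statement, however, yields

$$\left\|\frac{1}{N}\sum_{i=1}^{N} W_i\,\rho^{\otimes kn}\,W_i^\dagger - \sigma^{\otimes n}\right\|_1 \leq 4\epsilon,$$

and we are done.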

□ 



Remark III.3 By the same proof technique as in propositions II.2
and III.2 one can show that for many independent copies of a
COLUR map $T$ (acting on as many copies of a state $\rho$), the
entropy exchange has, in the asymptotic limit, the actual
character of a classical entropy rate, in the following sense:
the action of the map $T^{\otimes n}$ on a purification of
$\rho^{\otimes n}$ is approximated by a different COLUR map with
$N$ terms, where

$$\log N \leq n\bigl(S_e(T, \rho) + \epsilon\bigr).$$

These two propositions can be summarized in the following
theorem. Let us define, for given state $\rho$, integer $n$ and
$\epsilon > 0$, $N(n, \epsilon)$ as the smallest $N$ such that
there exists an $\epsilon$-disentangling COLUR map as in eq. (2).
Then, the entanglement erasure of $\rho$ is defined as the
minimal asymptotic noise rate needed to turn $\rho$ into a
separable state:

$$E_{er}(\rho) := \sup_{\epsilon>0}\,\limsup_{n\to\infty}\,\frac{1}{n}\log N(n, \epsilon).$$

As usual in information theory, we also define the optimistic
entanglement erasure by replacing the limsup by the liminf in the
previous formula:

$$\underline{E}_{er}(\rho) := \sup_{\epsilon>0}\,\liminf_{n\to\infty}\,\frac{1}{n}\log N(n, \epsilon).$$

Theorem III.4 For all bipartite states $\rho = \rho_{AB}$,

$$E_{er}(\rho) \geq \sup_{\epsilon>0}\,\limsup_{n\to\infty}\,\inf_{\|\sigma - R(\rho^{\otimes n})\|_1 \leq \epsilon}\,\frac{1}{n}S(\sigma) - S(\rho),$$

where the infimum is over all COLUR maps $R$ and separable states
$\sigma$;

$$E_{er}(\rho) \leq \liminf_{n\to\infty}\,\inf_{\sigma = R(\rho^{\otimes n})}\,\frac{1}{n}S(\sigma) - S(\rho),$$

with the infimum again over all COLUR maps $R$ and separable
states $\sigma$. □

We conjecture (without proof, at the moment) that the two limits
on the right hand side coincide. Note that the main difference
(apart from the uses of liminf and limsup) is that in the one we
consider maps taking the original state to perfectly separable
states, while in the other we still allow
$\epsilon$-approximations (which is why we need to include the
$\epsilon$ in the formula). If this conjecture turns out to be
true we have warranted our intuition from the beginning of this
section that the entanglement erasure is the minimal entropy one
has to "add" to the state to make it separable.

It remains as a major open problem to prove this conjecture, and
perhaps to find a single-copy optimisation formula for the
entanglement erasure $E_{er}$.



B. Classical correlations


Now we want to use the same approach to define and study the
classical correlation content of a quantum state. The intuitive
idea here is that what is left of the correlations after erasing
the quantum part ought to be addressed as the classical
correlations. In particular, a separable state has no quantum
correlations, so its total correlation (quantum mutual
information) should be addressed as classical correlation.

This motivates not one, but two definitions of classical
correlations. In the one we consider separable states $\sigma$
such that there exists an LUR map $R$ with

(a) $\|R(\rho^{\otimes n}) - \sigma\|_1 \leq \epsilon$,

in the other, any local cptp map $T = T_A \otimes T_B$ with

(b) $\|T(\rho^{\otimes n}) - \sigma\|_1 \leq \epsilon$.

Then let

$$C\ell_{er}(\rho) := \sup_{\epsilon>0}\,\limsup_{n\to\infty}\,\sup_{\sigma\text{ s.t. (a)}}\,\frac{1}{n}I(A:B)_\sigma,$$

$$C\ell^*_{er}(\rho) := \sup_{\epsilon>0}\,\limsup_{n\to\infty}\,\sup_{\sigma\text{ s.t. (b)}}\,\frac{1}{n}I(A:B)_\sigma.$$

In words, $C\ell_{er}(\rho)$ is the largest asymptotic total
erasure cost of (near-) separable states accessible from many
copies of $\rho$ by LUR, while $C\ell^*_{er}(\rho)$ extends the
maximisation over all states accessible by arbitrary local
operations (but, as in LUR, no communication or correlation).

Of course, we use the quantum mutual information to measure the
total correlations of the resulting near-separable state, because
of theorem II.3. There are also "optimistic" versions of these
definitions, denoted $\underline{C\ell}_{er}$ and
$\underline{C\ell}^*_{er}$, obtained by replacing the limsup by
liminf; but here we will not talk about these variants.

C. The pure state case

For a bipartite pure state, $\psi = |\psi\rangle\langle\psi|$,
$|\psi\rangle = \sum_i \sqrt{\lambda_i}\,|i\rangle|i\rangle$ in
Schmidt form, the total correlation is
$I(A:B) = 2S(\psi_A) = 2E(\psi) = 2H(\lambda)$ (with
$\psi_A = \mathrm{Tr}_B\psi$), i.e., twice the entropy of
entanglement. We will show that both the quantum and the
classical correlations are equal to $E(\psi) = H(\lambda)$, the
entropy of entanglement. This is to be expected in the light of
our introductory example and from the fact of entanglement
concentration [3]: indeed, for many copies of $\psi$, both Alice
and Bob can, without much distortion of the state, restrict to
their respective typical subspaces, and share a state which is
pretty much maximally entangled, at which point the reasoning of
the introduction should hold.

In rigorous detail, both Alice and Bob have typical subspace
projectors $\Pi_A$ and $\Pi_B$ for their reduced states
$\psi_A^{\otimes n}$ and $\psi_B^{\otimes n}$, respectively,
according to lemma A.1 in the appendix. Because of that result,
we have that
$\mathrm{Tr}\bigl(\psi^{\otimes n}\,\Pi_A \otimes \Pi_B\bigr) \geq 1 - \epsilon$
for large enough $n$, and the state
$|\Phi\rangle := (\Pi_A \otimes \Pi_B)|\psi\rangle^{\otimes n}$
has Schmidt-rank $D \leq 2^{n(S(\psi_A)+\epsilon)}$. On the other
hand, by the gentle measurement lemma A.2,
$\|\Phi - \psi^{\otimes n}\|_1 \leq \sqrt{8\epsilon} =: \delta$.

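Now a pure state of Schmidt-rank $D$ can always be disentangled by a local
phase randomisation using $D$ equiprobable unitaries: if
$|\Phi\rangle = \sum_j \sqrt{\mu_j}\,|j\rangle_A|j\rangle_B$,
$\Phi = |\Phi\rangle\langle\Phi|$, we let
$U_k := \sum_j e^{2\pi i jk/D}\,|j\rangle\langle j|$, and have

$$\frac{1}{D}\sum_{k=1}^{D} (U_k \otimes \mathbb{1})\,\Phi\,(U_k \otimes \mathbb{1})^\dagger = \sum_j \mu_j\,|j\rangle\langle j|_A \otimes |j\rangle\langle j|_B.$$

Hence, applying this same randomisation map to $\psi^{\otimes n}$ will
$\delta$-disentangle this state.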
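
As a numerical sanity check of the phase-randomisation identity, the following Python sketch (an illustration, not part of the original argument) averages the $D$ phase unitaries acting on Alice's side and confirms that only the diagonal terms survive. The Schmidt weights `mu` and the function name `phase_randomise` are hypothetical choices; the state is represented only on the $D$-dimensional span of the vectors $|j\rangle|j\rangle$, which is all the map acts on.

```python
import cmath

def phase_randomise(mu):
    """Average (U_k x 1) Phi (U_k x 1)^dagger over k = 1..D.

    Phi is the pure state with Schmidt weights mu, written as a D x D matrix
    in the basis |a>|a>; its entries are sqrt(mu[a] * mu[b]).
    """
    D = len(mu)
    avg = [[0j] * D for _ in range(D)]
    for k in range(1, D + 1):
        for a in range(D):
            for b in range(D):
                # U_k contributes e^{2 pi i a k / D} on the ket, the conjugate on the bra
                phase = cmath.exp(2j * cmath.pi * (a - b) * k / D)
                avg[a][b] += phase * (mu[a] * mu[b]) ** 0.5 / D
    return avg

mu = [0.5, 0.3, 0.2]            # assumed example weights, summing to 1
dephased = phase_randomise(mu)  # numerically diagonal, with diagonal equal to mu
```

The off-diagonal entries cancel because the phases $e^{2\pi i (a-b)k/D}$ sum to zero over $k$ whenever $a \neq b$, which is why exactly $D$ unitaries suffice.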

On the other hand, let an $\epsilon$-disentangling map $R$ for
$\psi^{\otimes n}$ be given. Then, just as in the proof of proposition II.1,

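$$\begin{aligned}
\log N \geq H(p) &\geq S_e(R, \psi^{\otimes n}) \\
  &\geq S(R(\psi^{\otimes n})) - S(\psi^{\otimes n}) \\
  &\geq S(\sigma) - n\epsilon\log d - \eta(\epsilon) \\
  &\geq S(\sigma_A) - n\epsilon\log d - \eta(\epsilon) \\
  &\geq S(R(\psi^{\otimes n})_A) - 2n\epsilon\log d - 2\eta(\epsilon) \\
  &\geq S(\psi_A^{\otimes n}) - 2n\epsilon\log d - 2\eta(\epsilon) \\
  &\geq n\bigl(S(\psi_A) - 2\epsilon\log d - 2\eta(\epsilon)\bigr),
\end{aligned}$$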

using, in this order: the triangle inequality in the second line, then the
Fannes inequality (with the separable state $\sigma$ which we assume to exist
$\epsilon$-close to $R(\psi^{\otimes n})$), then the inequality
$S(\sigma_{AB}) \geq S(\sigma_A)$ for separable states (this is implied by the
majorisation result of [16]), then the Fannes inequality once more and
finally the fact that the local entropy can only increase since we use a
locally unital map.

Letting $n \to \infty$ and $\epsilon \to 0$, these considerations prove
that $E_{\mathrm{er}}(\psi) = E(\psi) = S(\psi_A)$.



By a similarly simple consideration, we can also calculate the classical
correlation of $\psi$ (up to one only conjectured entropic inequality):

First, by simply locally dephasing the state $\psi^{\otimes n}$ in its
Schmidt basis, we can obtain a separable, perfectly correlated state
$\sigma^{\otimes n}$, which has as its quantum mutual information

$$I(A:B)_{\sigma^{\otimes n}} = S(\psi_A^{\otimes n}) = nE(\psi).$$

On the other hand, to show that this is (asymptotically) optimal, we need to
consider local operations $T_A$ and $T_B$ (now completely general, in the
spirit of the definition of $C\ell_{\mathrm{er}}$), such that
$\tau = (T_A \otimes T_B)(\psi^{\otimes n})$ is close to a separable state
$\sigma$: $\|\tau - \sigma\|_1 \leq \epsilon$.

By implementing the local operations as local unitaries $U_A$, $U_B$, with
ancillas which we keep for reference (compare figure 1), we preserve the
purity of the overall state: the output state
$|\vartheta\rangle = (U_A \otimes U_B)(|0\rangle_a|\psi\rangle^{\otimes n}|0\rangle_b)$
is the purification of $\tau$. Hence (by Uhlmann's theorem) there is a
purification $|\zeta\rangle$ of $\sigma$ such that
$\|\vartheta - \zeta\|_1 \leq \epsilon'$, with an $\epsilon'$ universally
dependent on $\epsilon$ [17]. Now, invoking the Fannes inequality a couple of
times,

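$$\begin{aligned}
nE(\psi) &= E(\vartheta) \\
  &\geq E(\zeta) - n\epsilon'\log d - \eta(\epsilon') \\
  &\overset{*}{\geq} I(A_1:B_1)_\sigma - n\epsilon'\log d - \eta(\epsilon') \\
  &\geq I(A_1:B_1)_\tau - n(3\epsilon + \epsilon')\log d - 3\eta(\epsilon) - \eta(\epsilon').
\end{aligned}$$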

Division by $n$, and letting $n \to \infty$ (such that $\epsilon, \epsilon' \to 0$),
yields the claim that the mutual information rate can asymptotically not
exceed $E(\psi)$. The inequality marked * in the third line we were not able
to prove rigorously (it is easily seen to be true in a great number of
cases); it is codified in the following conjecture, which we think is very
plausible.

Conjecture III.5 For a pure entangled state $\psi = \psi_{AB}$, and local
operations $T_A$, $T_B$ such that $\sigma = (T_A \otimes T_B)(\psi)$ is separable,

$$I(A:B)_\sigma \leq E(\psi).$$



The major difficulty of the proof stems from the fact that Alice and Bob may
use quite general local operations if their goal is to maximise the
classical correlation, e.g. they may apply local unitaries involving
ancillas, i.e. enlarge the Hilbert space (see figure 1). If they don't do
this, let's say for example that Alice acts only on her typical subspace,
then she cannot increase her local entropy above $n(S(\psi_A) + \epsilon)$,
which also is an upper bound for the mutual information of the separable
state. In general, of course, we would like to be able to avoid such an
assumption, and indeed the feeling is that going out of the typical subspace
is suboptimal anyway.

FIG. 1: One can locally implement the cptp maps $T_A$ and $T_B$ using
ancillas and unitaries. These unitaries rotate the initial pure state
$|\psi\rangle$ to a pure state
$|\zeta\rangle = (U_A \otimes U_B)(|0\rangle_a|\psi\rangle|0\rangle_b)$, which
hence has the same entanglement as $\psi$. The conjecture is thus a statement
about the pure state $\zeta$: relative to it, the conjecture states that
$I(A_1:B_1) \leq S(A_1A_2)$.



IV. GENERAL PROPERTIES OF QUANTUM 
AND CLASSICAL CORRELATION; 
COMPARISON WITH OTHER 
ENTANGLEMENT MEASURES 

A. Total correlation 

About the total correlation $C_{\mathrm{er}}(\rho) = I(A:B)_\rho$ of a state
we know most, primarily because we have a usable formula. For example,
because of strong subadditivity, it is monotonic under local operations, and
in remark II.6 we have already argued that monotonicity extends to local
operations and public communication.

Again because it coincides with the quantum mutual information, we can
easily relate the total correlation to distillability measures of quantum
states, namely total distillable correlation, distillable secret key and
distillable entanglement (which are decreasing in this order):

$$I(A:B) \geq CR(\rho) \geq K(\rho) \geq E_D(\rho)$$

(for the second quantity, the common randomness $CR(\rho)$ in a state, see
[18]; for the third and fourth, the distillable key $K(\rho)$ and the
distillable entanglement $E_D(\rho)$, see the recent results in [19]).



B. Quantum & classical correlations 



Our theorem III.4 narrows down the entanglement erasure up to the
regularisation and getting rid of $\epsilon$. This is not good enough to
decide any of the properties we would like an entanglement measure to have,
in the first place monotonicity under local operations and classical
communication. Similarly, we don't know how to prove or disprove convexity
of $E_{\mathrm{er}}$ (a situation much in contrast to the total correlations).

On the other hand, these properties are easily seen for the second variant
of our classical correlation quantity, $C\ell^*_{\mathrm{er}}$: it is monotonic
under local operations (no communication allowed, of course), and it is
convex.

Once more, we have at present little to offer in terms of comparing the
erasure (quantum and classical) correlation measures to other
quantifications of entanglement and classical correlation; clearly, we would
like $E_{\mathrm{er}}$ to be an upper bound on the distillable entanglement,
and some version of the classical correlation to be an upper bound on the
distillable secret key. It has been
suggested (2t| |^ that the (regularised) relative entropy 
of entanglement should relate to the entanglement era- 
sure — while this would be a most interesting result, we 
see no clear evidence either way. 

An interesting question arises when we return to the pure state example of
the introduction, where the total correlations could be erased neatly in two
steps: first by adding the minimal noise to dephase the state, and then
going on from there adding noise to classically decorrelate it. We have seen
that for pure states this is so generally, even for the asymptotic cost. But
a priori, the definitions of quantum correlations $E_{\mathrm{er}}$ and
classical correlations $C\ell_{\mathrm{er}}$ require us to target quite
different separable states; figure 2 illustrates this point.

Heuristically: for an initial state $\rho$, we could have a strategy of
adding local noise to turn it into a separable state $\sigma$ (or close to
one). Theorem III.4 indicates that the cost will asymptotically be the
entropic gap between $\sigma$ and $\rho$: $S(\sigma) - S(\rho)$. In the spirit
of the introductory example, we then go further and completely decorrelate
$\sigma$; according to theorem II.3 this will cost
$I(A:B)_\sigma = S(\sigma_A) + S(\sigma_B) - S(\sigma)$ bits of noise. Hence
the total cost of this two-step process will be

$$S(\sigma_A) + S(\sigma_B) - S(\rho),$$

whereas if we had destroyed the correlations in one go, we would have spent
noise amounting to

$$I(A:B)_\rho = S(\rho_A) + S(\rho_B) - S(\rho),$$

which is in general smaller. We have equality if the optimal disentangling
map does not increase the local entropies (or, in the asymptotic picture,
only by a sublinear amount). While this seems reasonable to expect, we have
no argument in favour of this expectation.
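
To make the comparison of the two costs concrete, here is a small Python sketch for the one case where the text has already established equality: a pure state that is dephased in its Schmidt basis. The Schmidt weights `lam` are a hypothetical example, and the entropies are entered by hand from the known spectra rather than computed from density matrices.

```python
from math import log2

def shannon(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(q * log2(q) for q in probs if q > 0)

lam = [0.5, 0.3, 0.2]   # assumed Schmidt weights of the pure state rho

# Pure state rho: global entropy 0, both reduced states have spectrum lam.
S_rho = 0.0
S_rho_A = S_rho_B = shannon(lam)

# Dephased state sigma = sum_j lam_j |jj><jj|: global and local spectra are all lam.
S_sigma = shannon(lam)
S_sigma_A = S_sigma_B = shannon(lam)

step1 = S_sigma - S_rho                  # noise needed to reach the separable state
step2 = S_sigma_A + S_sigma_B - S_sigma  # mutual information of sigma
two_step = step1 + step2
one_step = S_rho_A + S_rho_B - S_rho     # mutual information of rho
```

Here `two_step` and `one_step` agree because dephasing leaves the local spectra unchanged; a disentangling map that raised the local entropies would make the two-step cost strictly larger.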

Finally, is it true that the quantum correlation, measured by the
entanglement erasure $E_{\mathrm{er}}$, is always smaller than or equal to the
classical correlation? Our and perhaps the reader's intuition would answer
yes, but to prove this from our definitions seems not obvious.



V. MULTIPARTITE CORRELATIONS 

By obvious generalisations of the approaches presented in the previous two
sections one can also easily define total correlation and entanglement
measures for more than two parties in the many-copy limit.

FIG. 2: Starting from $\rho$, this figure illustrates the different
objectives one has when considering (i) the total correlations, (ii) the
quantum correlations, and (iii) the classical correlations. For this purpose
we have ignored the subtleties of the asymptotics, and symbolise the noise
required to go from one point in state space to another by their distance.
Then for (i) we seek the shortest way (minimal noise) from $\rho$ to the
manifold of product states (and we expect the target $\pi$ to be
$\approx \rho_A \otimes \rho_B$); for (ii) we seek instead the shortest way
from $\rho$ to the convex set of separable states, and going to the optimal
point $\sigma_1$ and from there on to a product state $\pi_1$ may in total
yield a suboptimal erasure procedure. Finally, for (iii), we want to go from
$\rho$ to a separable state $\sigma_2$ of maximal correlation (= distance
from product states). Even if the transition from $\rho$ to $\sigma_2$ is done
by a local randomising map, it could be that the noise cost is significantly
larger than that of going from $\rho$ to $\sigma_1$.

For a pure state $\rho = \psi$ we have argued in subsection III C that all
three optimal paths coincide, and that in fact
$E_{\mathrm{er}}(\psi) = C\ell_{\mathrm{er}}(\psi) = \frac{1}{2}C_{\mathrm{er}}(\psi) = E(\psi)$.

We don't want to go into too much detail here but discuss an aspect of the
total correlation measure $C_{\mathrm{er}}(\rho)$ of a state
$\rho_{A_1 \ldots A_p}$ of $p$ parties:

By easy generalisations of propositions II.1 and II.2 (and remark II.4), one
obtains that

$$C_{\mathrm{er}}(\rho) = \sum_{i=1}^{p} S(A_i) - S(A_1 \ldots A_p). \qquad (16)$$

As before, this asymptotic measure does not depend on the details of the
definition, and we find a generalisation of the fact that the randomisation
can be performed by one party alone in the bipartite case: the parties can
decorrelate themselves locally one by one from the rest, and the individual
costs add up to $C_{\mathrm{er}}$ of eq. (16). In detail: let $A_1$ decorrelate
herself from $A_2 \ldots A_p$ using $I(A_1 : A_2 \ldots A_p)$ bits of randomness
(by theorem II.3); then let $A_2$ decorrelate himself from $A_3 \ldots A_p$
using $I(A_2 : A_3 \ldots A_p)$; etc. Then adding up these quantities yields
obviously the right hand side of eq. (16).
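
The telescoping of the one-by-one decorrelation costs can be checked numerically. The sketch below is an illustration under a simplifying assumption: it uses a classical joint distribution of three bits (a state diagonal in a product basis), so Shannon entropies stand in for von Neumann entropies. The weights and the helper names `entropy` and `mutual_information` are hypothetical.

```python
from itertools import product
from math import log2

def entropy(joint, keep):
    """Shannon entropy (bits) of the marginal of `joint` on the positions in `keep`."""
    marginal = {}
    for outcome, prob in joint.items():
        key = tuple(outcome[i] for i in keep)
        marginal[key] = marginal.get(key, 0.0) + prob
    return -sum(q * log2(q) for q in marginal.values() if q > 0)

def mutual_information(joint, left, right):
    """I(left : right) = S(left) + S(right) - S(left, right)."""
    return entropy(joint, left) + entropy(joint, right) - entropy(joint, left + right)

# Assumed example: an arbitrary correlated distribution of p = 3 bits.
p = 3
weights = [3, 1, 2, 2, 1, 4, 1, 2]
joint = {bits: w / sum(weights)
         for bits, w in zip(product((0, 1), repeat=p), weights)}

# Party i decorrelates from parties i+1, ..., p-1 (zero-based indices).
step_costs = [mutual_information(joint, (i,), tuple(range(i + 1, p)))
              for i in range(p - 1)]

# Sum of local entropies minus the global entropy.
total = sum(entropy(joint, (i,)) for i in range(p)) - entropy(joint, tuple(range(p)))
```

The sum of `step_costs` equals `total` because each joint entropy of the remaining parties appears once with a plus sign and once with a minus sign, leaving only the local entropies and the global entropy.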

VI. DISCUSSION 

In this paper we have addressed the problem of an op- 
erational definition of the total, quantum and classical 
amount of correlation in a bipartite quantum state. We 
have shown that the above quantities can be defined via 
the amount of noise that is required to destroy the cor- 
relations. 

We have proved that the total correlation in a bipartite 
quantum state, measured by the asymptotically minimal 
amount of noise needed to erase the correlation, equals 
the quantum mutual information I (A : B). Thus, our 
approach gives the first clear operational definition of 
$I(A:B)$ for any given state. This even led to an operational proof of
strong subadditivity; it is an interesting question whether the equality
conditions derived recently can be derived in this way, too.

Then we extended our approach to definitions of the 
quantum (entanglement) and classical correlation con- 
tent: after definitions of these quantities in the spirit of 
erasure, by the noise needed to destroy the entanglement, 
and the maximum correlation left after destroying the en- 
tanglement, we proved partial results on these quantities, 
and related them to other entanglement and correlation 
measures. In that context, we also put forward the con- 
jecture that the amount of quantum correlations is always 
at most as large as the amount of classical correlations. 
For pure states we have verified, up to a plausible con- 
jectured information inequality for separable states, that 
the proposed quantum and classical correlation measures 
coincide with the entropy of entanglement. In general, we had to leave open
the questions of LO(CC) monotonicity and convexity of $E_{\mathrm{er}}$ and
$C\ell_{\mathrm{er}}$. (That $C\ell^*_{\mathrm{er}}$ is monotonic under local
operations is, however, trivial from the definition!)

The reader who is acquainted with the work of the Horodecci, Oppenheim and
Sen [23] will sense that there is a relation between their
"thermodynamical" approach to correlations via extractable work (= purity),
and ours, even though superficially we seem to go in opposite directions: we
consider the entropy increase necessary to destroy correlations, and this
directly gives a correlation measure; in the approach of [23] the purity
content decreases as one restricts the set of allowed operations, and the
"total correlation" appears as a deficit between global operations and local
operations. If one allows also communication, the deficit is a quantum
correlations measure. Recently, however, these authors have been able to
relate this latter deficit to the entropy production when turning the given
state into a product via certain LOCC maps [20]. Via Landauer erasure, this
now looks a lot more like our model, and indeed it seems to be the case that
by including classical communication, [20] allows for a wider class of
destructive operations, and consequently, the resulting entanglement measure
is no larger than our $E_{\mathrm{er}}$. This makes the lower bound from [20]
applicable, yielding that the entanglement erasure is lower bounded by the
relative entropy of entanglement (with respect to the separable set). It
remains to be investigated whether there is indeed a gap between them or
whether the difference is washed out in the asymptotic limit involved in
both definitions.

APPENDIX A: TYPICALITY.
OPERATOR CHERNOFF BOUND.
MISCELLANEOUS RESULTS

From [24] we cite the following definitions and properties of typical
subspaces:

For the state (density operator) $\rho$ choose a diagonalisation
$\rho = \sum_i p_i |i\rangle\langle i|$ (such that $S(\rho) = H(p)$). Then, with
$I = i_1 \ldots i_n$ and

$$p_I = p_{i_1} \cdots p_{i_n}, \qquad |I\rangle\langle I| = |i_1\rangle\langle i_1| \otimes \cdots \otimes |i_n\rangle\langle i_n|,$$

we have $\rho^{\otimes n} = \sum_I p_I |I\rangle\langle I|$. We call (with
$\epsilon > 0$ fixed implicitly) a state $|I\rangle$ typical, if

$$\left| -\log p_I - nS(\rho) \right| \leq \epsilon n.$$

We define the $\epsilon$-typical subspace to be the subspace spanned by all
typical states, and $\Pi$ to be the orthogonal projector onto the typical
subspace ($n$ and $\epsilon$ as before implicit).



The following theorem states the properties of the typical subspace and its
projector $\Pi$ (which can easily be proved by the definitions and the law of
large numbers):

Lemma A.1 (Typical subspace theorem) For any state $\rho$, integer $n$ and
$\epsilon > 0$ let $\Pi$ be the typical subspace projector. Then:

• For all $\delta > 0$ and sufficiently large $n$,

$$\mathrm{Tr}(\rho^{\otimes n}\Pi) \geq 1 - \delta.$$

In other words, by enlarging $n$ the probability of $\rho^{\otimes n}$ to be
found in the typical subspace can be made as close to 1 as desired.

• For sufficiently large $n$, the dimension of the typical subspace equals
$\mathrm{Tr}\,\Pi$, and satisfies

$$2^{n(S(\rho)-\epsilon)} \leq \mathrm{Tr}\,\Pi \leq 2^{n(S(\rho)+\epsilon)}.$$

Indeed, for all $n$,

$$2^{-n(S(\rho)+\epsilon)}\,\Pi \leq \Pi\rho^{\otimes n}\Pi \leq 2^{-n(S(\rho)-\epsilon)}\,\Pi.$$
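
The two properties of the typical subspace can be illustrated numerically for a single-qubit state. The following Python sketch is a hedged example rather than part of the cited result: it assumes a state with eigenvalues $(p, 1-p)$, groups the basis states $|I\rangle$ by how many factors carry the eigenvalue $p$, and applies the typicality condition directly. The function name and the parameter values are hypothetical.

```python
from math import comb, log2

def typical_subspace_stats(p, n, eps):
    """Entropy, typical-subspace dimension Tr(Pi), and weight Tr(rho^{(x)n} Pi)
    for a qubit state rho with eigenvalues (p, 1 - p)."""
    entropy = -p * log2(p) - (1 - p) * log2(1 - p)      # S(rho) = H(p)
    dim, weight = 0, 0.0
    for k in range(n + 1):          # k = number of tensor factors with eigenvalue p
        log_p_seq = k * log2(p) + (n - k) * log2(1 - p)  # log of the eigenvalue p_I
        # Typicality condition, with a tiny tolerance for floating-point rounding
        if abs(-log_p_seq - n * entropy) <= eps * n + 1e-9:
            dim += comb(n, k)                            # number of such basis states
            weight += comb(n, k) * 2.0 ** log_p_seq      # their total probability
    return entropy, dim, weight

S, dim, weight = typical_subspace_stats(p=0.2, n=400, eps=0.1)
```

For these assumed parameters the typical basis states are those with between 60 and 100 factors carrying the eigenvalue 0.2, so the weight is the probability that a binomial variable stays within 2.5 standard deviations of its mean.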



Lemma A.2 (Gentle measurement [25]) Let $\rho$ be a density operator with
$\mathrm{Tr}\,\rho \leq 1$, and $X$ an operator with $0 \leq X \leq \mathbb{1}$,
such that $\mathrm{Tr}\,\rho X \geq \mathrm{Tr}\,\rho - \delta$. Then

$$\left\| \rho - \sqrt{X}\rho\sqrt{X} \right\|_1 \leq \sqrt{8\delta}.$$

(The factor 8 can be improved to 4: see [26].) Here the operator order is
defined by saying that $X \geq Y$ iff $X - Y$ is positive semidefinite. This
is a partial order. The interval $[A; B]$ is defined as the set of all
operators $X$ such that $A \leq X$ and $X \leq B$.

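Furthermore, we shall make use of the following result:

Lemma A.3 (Operator Chernoff bound [27]) Let $X_1, \ldots, X_N$ be i.i.d.
random variables taking values in the operator interval
$[0; \mathbb{1}] \subset \mathcal{B}(\mathbb{C}^d)$ and with expectation
$\mathbb{E}X_i = M \geq \mu\mathbb{1}$. Then, for $0 \leq \epsilon \leq 1$, and
denoting $\overline{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$,

$$\Pr\{\overline{X} \not\leq (1+\epsilon)M\} \leq d\,\exp\left(-N\frac{\epsilon^2\mu}{2\ln 2}\right),$$

$$\Pr\{\overline{X} \not\geq (1-\epsilon)M\} \leq d\,\exp\left(-N\frac{\epsilon^2\mu}{2\ln 2}\right).$$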